Abstract: The cutoff rate $R_0(W)$ of a discrete memoryless
channel (DMC) $W$ is often used as a figure of merit, alongside
the channel capacity $C(W)$. Given a channel $W$ consisting
of two possibly correlated subchannels $W_1$, $W_2$, the capacity
function always satisfies $C(W_1) + C(W_2) \le C(W)$, while there
are examples for which $R_0(W_1) + R_0(W_2) > R_0(W)$. This
fact that cutoff rate can be "created" by channel splitting was
noticed by Massey in his study of an optical modulation system
modeled as an M'ary erasure channel. This paper demonstrates
that similar gains in cutoff rate can be achieved for general
DMC's by methods of channel combining and splitting. The relation
of the proposed method to Pinsker's early work on cutoff rate
improvement and to Imai-Hirakawa multi-level coding is also
discussed.

I. Introduction 

Let $W$ be a DMC with input alphabet $\mathcal{X}$, output alphabet
$\mathcal{Y}$, and transition probabilities $W(y|x)$. Let $Q$ be a
probability distribution on $\mathcal{X}$, and define the functions

$$E_0(\rho, Q, W) = -\log \sum_{y} \Big[ \sum_{x} Q(x)\, W(y|x)^{\frac{1}{1+\rho}} \Big]^{1+\rho}$$

where $\rho \ge 0$ (all logarithms are to the base 2 throughout), and

$$E_r(R, Q, W) = \max_{0 \le \rho \le 1} \big[ E_0(\rho, Q, W) - \rho R \big]$$

where $R \ge 0$. The random-coding exponent is given by

$$E_r(R, W) = \max_{Q} E_r(R, Q, W).$$

Gallager [1, Theorem 5.6.2] shows that the probability of ML
(maximum-likelihood) decoding error $P_e$ over a $(N, 2^{NR}, Q)$
block code ensemble is upper-bounded by $2^{-N E_r(R, Q, W)}$. A
$(N, 2^{NR}, Q)$ block code ensemble is one where each letter of
each codeword is chosen independently from distribution $Q$.
Gallager shows that the exponent $E_r(R, W)$ is positive for
all rates $0 \le R < C$, where $C$ is the channel capacity. The
channel cutoff rate is defined as $R_0(W) = \max_Q E_0(1, Q, W)$
and equals the random-coding exponent at rate $R = 0$, i.e.,
$R_0(W) = E_r(0, W)$.
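The definitions above can be evaluated numerically for small channels. The following is a minimal sketch, not part of the original development: the function names `gallager_e0` and `cutoff_rate_uniform` are illustrative, the channel is passed as a row-stochastic list of lists, and the maximization over $Q$ is replaced by the uniform distribution, which is an assumption that holds for symmetric channels such as the BSC.

```python
import math

def gallager_e0(rho, Q, W):
    """E_0(rho, Q, W) in bits. W[x][y] holds W(y|x); Q[x] is the input law."""
    total = 0.0
    for y in range(len(W[0])):
        inner = sum(Q[x] * W[x][y] ** (1.0 / (1.0 + rho)) for x in range(len(Q)))
        total += inner ** (1.0 + rho)
    return -math.log2(total)

def cutoff_rate_uniform(W):
    """E_0(1, Q, W) with Q uniform (assumed optimal; true for symmetric channels)."""
    Q = [1.0 / len(W)] * len(W)
    return gallager_e0(1.0, Q, W)

eps = 0.1  # illustrative crossover probability
bsc = [[1 - eps, eps], [eps, 1 - eps]]
print(f"R_0(BSC) = {cutoff_rate_uniform(bsc):.4f}")  # prints R_0(BSC) = 0.3219
```

For a channel without symmetry, the uniform choice only gives a lower bound on $R_0(W)$, and a search over $Q$ would be needed.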

Gallager's "parallel channels theorem" [1, p. 149] states that

$$E_0(\rho, W_1 \otimes W_2) = E_0(\rho, W_1) + E_0(\rho, W_2)$$

where $E_0(\rho, W) = \max_Q E_0(\rho, Q, W)$,
$W_1 : \mathcal{X}_1 \to \mathcal{Y}_1$ and $W_2 : \mathcal{X}_2 \to \mathcal{Y}_2$
are any two DMC's, and $W_1 \otimes W_2$ denotes a DMC
$W : \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{Y}_1 \times \mathcal{Y}_2$
with transition probabilities
$W(y_1, y_2 | x_1, x_2) = W_1(y_1|x_1)\, W_2(y_2|x_2)$
for all $(x_1, x_2) \in \mathcal{X}_1 \times \mathcal{X}_2$ and
$(y_1, y_2) \in \mathcal{Y}_1 \times \mathcal{Y}_2$. This
theorem implies that $E_0(\rho, W^{\otimes n}) = n E_0(\rho, W)$ and hence
$E_r(nR, W^{\otimes n}) = n E_r(R, W)$. This is a single-letterization
result stating that the random-coding exponent cannot be improved
by considering ensembles where codewords are made
up of super-symbols chosen from an arbitrary distribution $Q_n$
on blocks of $n$ channel inputs.
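The additivity behind the parallel channels theorem can be checked numerically in the special case of product input distributions, where $E_0(\rho, Q_1 \times Q_2, W_1 \otimes W_2) = E_0(\rho, Q_1, W_1) + E_0(\rho, Q_2, W_2)$ holds exactly. The sketch below checks only this identity; it does not check that product inputs are optimal, which is the harder part of the theorem. The helper names and the particular channels and distributions are illustrative assumptions.

```python
import math

def gallager_e0(rho, Q, W):
    """E_0(rho, Q, W) in bits. W[x][y] holds W(y|x); Q[x] is the input law."""
    total = 0.0
    for y in range(len(W[0])):
        inner = sum(Q[x] * W[x][y] ** (1.0 / (1.0 + rho)) for x in range(len(Q)))
        total += inner ** (1.0 + rho)
    return -math.log2(total)

def product_channel(W1, W2):
    """Transition matrix of W1 (x) W2; inputs (x1, x2) and outputs (y1, y2)
    are both enumerated with the first coordinate as the major index."""
    return [[a * b for a in row1 for b in row2] for row1 in W1 for row2 in W2]

def product_input(Q1, Q2):
    """Product distribution Q1 x Q2 in the same enumeration order."""
    return [p * q for p in Q1 for q in Q2]

# Illustrative component channels: a BSC(0.1) and a BEC(0.25).
bsc = [[0.9, 0.1], [0.1, 0.9]]
bec = [[0.75, 0.0, 0.25], [0.0, 0.75, 0.25]]
Q1, Q2, rho = [0.3, 0.7], [0.5, 0.5], 0.5

joint = gallager_e0(rho, product_input(Q1, Q2), product_channel(bsc, bec))
separate = gallager_e0(rho, Q1, bsc) + gallager_e0(rho, Q2, bec)
print(joint, separate)
```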

A. Massey's example

The independence of channels $W_1$ and $W_2$ is crucial in the
parallel channels theorem; if they are correlated then equality
may fail either way. Massey [2] made use of this fact to gain a
coding advantage in the context of an optical communication
system. Massey's idea is illustrated in the following example;
this same example was also discussed in [3].

Example 1 (Massey [2]): Consider the quaternary erasure
channel (QEC), $W : \mathcal{X}_1 \times \mathcal{X}_2 \to \mathcal{Y}_1 \times \mathcal{Y}_2$
where $\mathcal{X}_1 = \mathcal{X}_2 = \{0, 1\}$,
$\mathcal{Y}_1 = \mathcal{Y}_2 = \{0, 1, ?\}$, and

$$W(y_1 y_2 | x_1 x_2) = \begin{cases} 1 - \epsilon, & y_1 y_2 = x_1 x_2 \\ \epsilon, & y_1 y_2 = ?? \end{cases}$$

where $0 \le \epsilon \le 1$ is the erasure probability. The QEC $W$
can be decomposed into two BEC's (binary erasure channels):
$W_i : \mathcal{X}_i \to \mathcal{Y}_i$, $i = 1, 2$. In this decomposition,
a transition $(x_1, x_2) \to (y_1, y_2)$ over the QEC is viewed as two
transitions, $x_1 \to y_1$ and $x_2 \to y_2$, taking place on the
respective component channels, with

$$W_i(y_i | x_i) = \begin{cases} 1 - \epsilon, & y_i = x_i \\ \epsilon, & y_i = ? \end{cases}$$

These BEC's are fully correlated in the sense that an erasure
occurs either in both or in none.

Humblet [4] gives the random-coding exponent for the
M'ary erasure channel (MEC) as follows.

$$E_r(R, \mathrm{MEC}) = \begin{cases} D\big(1 - \frac{R}{\log M} \,\big\|\, \epsilon\big), & R_c \le R \le C \\ R_0 - R, & 0 \le R \le R_c \end{cases} \qquad (1)$$

where $D(\delta \| \epsilon) = \delta \log(\delta/\epsilon) + (1 - \delta) \log[(1 - \delta)/(1 - \epsilon)]$,
$C = (1 - \epsilon) \log M$ is the capacity, $R_c = C/[1 + (M - 1)\epsilon]$ is
the critical rate, and $R_0 = \log M - \log[1 + (M - 1)\epsilon]$ is the
cutoff rate. Fig. 1 shows the random-coding exponents for the
QEC and the BEC with $\epsilon = 0.25$. It is seen from the figure
that



$$E_r(R, W) < E_r(R/2, W_1) + E_r(R/2, W_2). \qquad (2)$$



In fact, for rates $R > R_c(W) = 2(1 - \epsilon)/(1 + 3\epsilon)$, the exponent
is doubled by splitting: $E_r(R/2, W_1) + E_r(R/2, W_2) = 2E_r(R, W)$.
Also, $C(W) = C(W_1) + C(W_2)$, i.e., the
capacity of the QEC is not degraded by splitting it into BEC's.

Fig. 1. Random-coding exponents for QEC and BEC.
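The comparison in Fig. 1 can be reproduced from formula (1). The sketch below is an illustration under stated assumptions: the function names are hypothetical, $0 < \epsilon < 1$ is assumed, and the exponent is taken to be zero at and above capacity, which (1) itself does not state. In this sketch the exact doubling of the exponent shows up once $R/2$ is also at or above the critical rate of the BEC; at lower rates the sum still exceeds the QEC exponent but is not exactly twice it.

```python
import math

def kl_div(delta, eps):
    """Binary divergence D(delta || eps) in bits, with 0 log 0 taken as 0."""
    total = 0.0
    if delta > 0:
        total += delta * math.log2(delta / eps)
    if delta < 1:
        total += (1 - delta) * math.log2((1 - delta) / (1 - eps))
    return total

def mec_exponent(R, M, eps):
    """Random-coding exponent of the M'ary erasure channel, following (1)."""
    log_m = math.log2(M)
    capacity = (1 - eps) * log_m
    r_crit = capacity / (1 + (M - 1) * eps)
    r0 = log_m - math.log2(1 + (M - 1) * eps)
    if R >= capacity:
        return 0.0          # assumption: no positive exponent at or above capacity
    if R <= r_crit:
        return r0 - R       # straight-line portion
    return kl_div(1 - R / log_m, eps)

eps = 0.25
for R in (0.2, 0.6, 1.0, 1.3):
    joint = mec_exponent(R, 4, eps)          # direct coding on the QEC
    split = 2 * mec_exponent(R / 2, 2, eps)  # two BECs, each at rate R/2
    print(f"R={R:.1f}  QEC: {joint:.4f}  split: {split:.4f}")
```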

Instead of direct coding of the QEC $W$, Massey suggested
applying independent encoding of the component BECs $W_1$
and $W_2$, ignoring the correlation between the two channels.
The second alternative presents significant advantages with
respect to (i) the reliability-complexity tradeoff in ML decoding,
and (ii) the cutoff-rate criterion.

Reliability-complexity tradeoff. Consider block coding on
the QEC using a $(N, 2^{NR}, Q)$ code ensemble where $Q$ is
uniform, so that $E_r(R, W) = E_r(R, Q, W)$ for all $R$. The
ML decoding complexity $\chi$ is proportional to the number
of codewords, $\chi \doteq 2^{NR}$, where $\doteq$ is read here as
equality to first order in the exponent. The reliability is given by
$P_e \doteq 2^{-N E_r(R, W)}$.

Next, consider ML decoding over the two subchannels $W_1$
and $W_2$, using independent $(2N, 2^{NR}, Q')$ ensembles, where
$Q'$ is uniform. Then, $E_r(R, \mathrm{BEC}) = E_r(R, Q', \mathrm{BEC})$, and the
ML complexity and reliability figures are $\chi_1 + \chi_2 \doteq 2^{NR}$ and
$P_{e,1} + P_{e,2} \doteq 2^{-2N E_r(R/2, \mathrm{BEC})}$. Thus, for the same order of
complexity, the second alternative offers higher reliability due
to inequality (2).

The cutoff rate criterion. One reason for considering the
cutoff rate as a figure of merit for comparing the two coding
alternatives in Massey's example is its role in
sequential decoding, which is a decoding algorithm for tree
codes invented by Wozencraft [5]. Sequential decoding can
be used to achieve arbitrarily reliable communication on any
DMC $W$ at rates arbitrarily close to $R_0(W)$ while keeping the
average computation per decoded digit bounded by a constant
that depends on the code rate and the channel $W$, but not on
the desired level of reliability. Sequential decoding applied
directly to the QEC can achieve $R_0(\mathrm{QEC}) = 2 - \log(1 + 3\epsilon)$.
If instead one applies independent coding and sequential
decoding on the component channels, one can achieve a sum
rate of $2R_0(\mathrm{BEC}) = 2[1 - \log(1 + \epsilon)]$, which exceeds
$R_0(\mathrm{QEC})$ for all $0 < \epsilon < 1$, as shown in Fig. 2. The figure
shows that Massey's method bridges the gap between the cutoff rate and
the capacity of the QEC significantly.
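The gain quoted above has a simple closed form: $2R_0(\mathrm{BEC}) - R_0(\mathrm{QEC}) = \log[(1 + 3\epsilon)/(1 + \epsilon)^2]$, which is positive because $(1 + \epsilon)^2 < 1 + 3\epsilon$ whenever $0 < \epsilon < 1$. The short sketch below tabulates the three quantities compared in Fig. 2; the function names and the sample values of $\epsilon$ are illustrative choices, not taken from the source.

```python
import math

def r0_qec(eps):
    """Cutoff rate of the QEC under direct coding: 2 - log2(1 + 3 eps)."""
    return 2 - math.log2(1 + 3 * eps)

def r0_bec(eps):
    """Cutoff rate of one component BEC: 1 - log2(1 + eps)."""
    return 1 - math.log2(1 + eps)

def capacity_qec(eps):
    """Capacity of the QEC: (1 - eps) log2(4)."""
    return 2 * (1 - eps)

for eps in (0.1, 0.25, 0.5, 0.9):
    direct, split, cap = r0_qec(eps), 2 * r0_bec(eps), capacity_qec(eps)
    print(f"eps={eps:.2f}  R0(QEC)={direct:.4f}  2*R0(BEC)={split:.4f}  C={cap:.4f}")
```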

Apart from its significance in sequential decoding, the
cutoff rate serves as a one-parameter gauge of the channel
reliability exponent. Since $R_0(W)$ is the vertical axis intercept
of the $E_r(R, W)$ vs. $R$ curve, i.e., $R_0(W) = E_r(0, W)$, an
improvement in the cutoff rate is usually accompanied by
an improvement in the entire random-coding exponent. For a
more detailed justification of the use of cutoff rate as a figure
of merit for a communication system, we refer to [6], [7].

Fig. 2. Capacity and cutoff rate for the splitting of a QEC.



B. Outline 

This paper addresses the following questions raised by 
Massey's example. Can any DMC be split in some way to 
achieve coding gains as measured by improvements in the ML 
reliability-complexity tradeoff or in the cutoff rate? And, if so, 
what are the limits of such gains? 

We address these questions in the framework of coding 
systems that consist of three elements: (i) channel combining, 
(ii) input relabeling, and (iii) channel splitting. In Massey's 
example there is no channel combining; a given channel is 
simply split into subchannels. However, in general, it turns out 
that it is advantageous to combine multiple copies of a given 
channel prior to splitting. Input relabeling exists in Massey's 
example: the inputs of the QEC which would normally be 
labeled as {0,1,2,3} are instead labeled as {00,01,10,11}. 
Channel splitting is achieved in Massey's example by complete 
separation of both the encoding and the decoding tasks on 
the subchannels. In this paper, we keep the condition that
the encoders for the subchannels be independent but admit
successive cancellation or multi-level type decoders where each
decoder communicates its decision to the next decoder in a
pre-fixed order. In this sense, our results have connections with
the Imai-Hirakawa multi-level coding scheme [8].



The main result of the paper is the demonstration of 
some very simple techniques by which significant cutoff rate 
improvements can be obtained for the BEC and the BSC 
(binary symmetric channel). The methods presented are readily 
applicable to a larger class of channels. 

II. Channel combining and splitting 

In order to seek gains as measured by the cutoff rate, we
will consider DMCs of the form $W : \mathcal{X}^n \to \mathcal{Z}$ for some
integer $n \ge 2$, obtained by combining $n$ independent copies
of a given DMC $V : \mathcal{X} \to \mathcal{Y}$, as shown in Fig. 3. An essential
element of the channel combining procedure is a bijective
function $f : \mathcal{X}^n \to \mathcal{X}^n$ that relabels the inputs of $V^{\otimes n}$
(the channel that consists of $n$ independent copies of $V$). The
resulting channel is a DMC $W : \mathcal{X}^n \to \mathcal{Z} = \mathcal{Y}^n$ such that
$W(z | u_1, \ldots, u_n) = \prod_{i=1}^{n} V(y_i | x_i)$ where
$(x_1, \ldots, x_n) = f(u_1, \ldots, u_n)$, $(u_1, \ldots, u_n) \in \mathcal{X}^n$,
$z = (y_1, \ldots, y_n) \in \mathcal{Z}$.

Fig. 3. Channel combining and input relabeling (synthesized channel $W$).

We will regard $W$ as an $n$-input multi-access channel where
each input is encoded independently by a distinct user. The
decoder in the system is a successive-cancellation type decoder
where each decoder feeds its decision to the next decoder,
and there is only one pass in the algorithm. We will refer to
such a coding system as a multi-level coding system, using the
terminology of [8].

Fig. 4. Channel splitting by multi-level coding.

The multi-level coding system here is designed around a
random code ensemble for channel $W$, specified by a random
vector $U = (U_1, \ldots, U_n) \sim Q_1(u_1) \cdots Q_n(u_n)$ where $Q_i$ is
a probability distribution on $\mathcal{X}$, $1 \le i \le n$. Intuitively, $U_i$
corresponds to the input random variable that is transmitted at
the $i$th input terminal. If we employ a sequential decoder that
decodes the subchannels one at a time, applying successive
cancellation between stages, the sum cutoff rate can be as
high as

$$R_{0,S}(U; Z) = R_0(U_1; Z) + \cdots + R_0(U_n; Z | U_1 \cdots U_{n-1})$$

where for any three random vectors $(U, V, Z) \sim P(u, v, z)$

$$R_0(U; Z | V) \triangleq -\log \sum_{v} P(v) \sum_{z} \Big[ \sum_{u} P(u|v) \sqrt{P(z|u, v)} \Big]^2.$$

This sum cutoff rate is to be compared with the ordinary cutoff
rate $R_0(W) = \max_Q R_0(Q, W)$ where the maximum is over
all $Q(u_1, \ldots, u_n)$, not necessarily in product form. A coding
gain is achieved if $R_{0,S}(U; Z)$ is larger than $R_0(W)$. Since
$R_0(W) = nR_0(V)$ for all bijective label maps $f$, by the
parallel channels theorem mentioned earlier, we may compare
the normalized sum cutoff rate

$$\bar{R}_{0,S}(U; Z) = \frac{1}{n} R_{0,S}(U; Z)$$

with $R_0(V)$ to see if there is a coding gain.
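The sum cutoff rate can be evaluated by brute force for small $n$. The sketch below is one possible implementation under explicit assumptions, not the author's program: inputs are taken i.i.d. uniform, the channel $V$ is a row-stochastic list of lists, and the names `conditional_cutoff_rates` and `xor_map` are hypothetical. The two-input relabeling $(u_1, u_2) \to (u_1 \oplus u_2, u_2)$ is used only as an illustrative label map.

```python
import itertools
import math

def conditional_cutoff_rates(V, label_map, n):
    """Return [R_0(U_i; Z | U_1..U_{i-1}) for i = 1..n] in bits.

    V[x][y] = V(y|x); label_map sends a tuple u to a tuple x of equal length.
    Inputs are assumed i.i.d. uniform (an assumption of this sketch).
    """
    q = len(V)
    outputs = list(itertools.product(range(len(V[0])), repeat=n))
    # W[u][j] = W(z_j | u) = prod_i V(y_i | x_i) with x = label_map(u)
    W = {}
    for u in itertools.product(range(q), repeat=n):
        x = label_map(u)
        W[u] = [math.prod(V[xi][yi] for xi, yi in zip(x, z)) for z in outputs]
    rates = []
    for i in range(n):
        tails = list(itertools.product(range(q), repeat=n - i - 1))
        total = 0.0
        for v in itertools.product(range(q), repeat=i):  # already-decoded inputs
            for j in range(len(outputs)):
                inner = 0.0
                for ui in range(q):
                    # P(z | u_i, v): average over the inputs not yet decoded
                    p = sum(W[v + (ui,) + t][j] for t in tails) / len(tails)
                    inner += math.sqrt(p) / q
                total += inner ** 2 / q ** i
        rates.append(-math.log2(total))
    return rates

def xor_map(u):
    """Illustrative two-input relabeling (u1, u2) -> (u1 XOR u2, u2)."""
    return (u[0] ^ u[1], u[1])

eps = 0.1  # illustrative BSC crossover probability
bsc = [[1 - eps, eps], [eps, 1 - eps]]
rates = conditional_cutoff_rates(bsc, xor_map, 2)
print("per-user cutoff rates:", rates)
print("normalized sum cutoff rate:", sum(rates) / len(rates))
```

The cost grows exponentially in $n$ because every input and output tuple is enumerated, so this approach is only practical for a handful of combined channels.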

The general framework described above admits a method 
by Pinsker [9] that shows that if a sufficiently large number 
of copies of a DMC are combined, the sum cutoff rate can 
be made arbitrarily close to channel capacity. Unfortunately, 
the complexity of Pinsker's scheme grows exponentially with 
the number of channels combined. Although not practical, 
Pinsker's result is reassuring as far as the above method is con- 
cerned; and, the main question becomes one of understanding 
how fast the sum cutoff rate improves as one increases the 
number of channels combined. 

III. BEC and BSC examples

The goal of this section is to illustrate the effectiveness of
the above method by giving two examples, where appreciable
improvements in the cutoff rate are obtained by combining
just two copies of a given channel.

Example 2 (BEC): Let $V : \mathcal{X} \to \mathcal{Y}$ be a BEC with
alphabets $\mathcal{X} = \{0, 1\}$, $\mathcal{Y} = \{0, 1, ?\}$, and erasure probability
$\epsilon$. Consider combining two independent copies of $V$ to obtain
a channel $W : \mathcal{X}^2 \to \mathcal{Y}^2$ by means of the label map

$$f : (u_1, u_2) \to (x_1, x_2) = (u_1 \oplus u_2, u_2)$$

where $\oplus$ denotes modulo-2 addition. Let the input variables
be specified as $(U_1, U_2) \sim Q_1(u_1) Q_2(u_2)$ where $Q_1$, $Q_2$ are
uniform on $\{0, 1\}$. Then, we compute that

$$R_0(U_1; Y_1 Y_2) = 1 - \log(1 + 2\epsilon - \epsilon^2)$$
$$R_0(U_2; Y_1 Y_2 | U_1) = 1 - \log(1 + \epsilon^2)$$

An interpretation of these cutoff rates can be given by observing
that user 1's channel, $u_1 \to (y_1, y_2)$, is effectively a BEC
with erasure probability $1 - (1 - \epsilon)^2 = 2\epsilon - \epsilon^2$; an erasure
occurs in this channel when either $x_1$ or $x_2$ is erased. On the
other hand, given that decoder 2 is supplied with the correct
value of $u_1$, the channel seen by user 2 is a BEC with erasure
probability $\epsilon^2$; an erasure occurs only when both $x_1$ and $x_2$
are erased. The normalized sum cutoff rate under this scheme
is given by

$$\bar{R}_{0,S}(U_1 U_2; Y_1 Y_2) = 1 - \tfrac{1}{2} \big[ \log(1 + 2\epsilon - \epsilon^2) + \log(1 + \epsilon^2) \big]$$

which is to be compared with the ordinary cutoff rate of the
BEC, $R_0(V) = 1 - \log(1 + \epsilon)$. These cutoff rates are shown in
Fig. 5. The figure shows, and it can be verified analytically, that
the above method improves the cutoff rate for all $0 < \epsilon < 1$.

Fig. 5. Cutoff rates for the splitting of BEC.
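The analytical verification for the BEC can be sketched in a few lines. The derivation below is a worked check added for illustration, using only the two cutoff rate expressions of Example 2: the normalized sum cutoff rate exceeds $R_0(V)$ exactly when the product of the two logarithm arguments is smaller than $(1 + \epsilon)^2$.

```latex
\begin{align*}
(1+\epsilon)^2 - (1 + 2\epsilon - \epsilon^2)(1 + \epsilon^2)
  &= (1 + 2\epsilon + \epsilon^2) - (1 + 2\epsilon + 2\epsilon^3 - \epsilon^4) \\
  &= \epsilon^2 - 2\epsilon^3 + \epsilon^4 \\
  &= \epsilon^2 (1 - \epsilon)^2 \;>\; 0 \qquad \text{for } 0 < \epsilon < 1 .
\end{align*}
```

Taking logarithms gives $\tfrac{1}{2}[\log(1 + 2\epsilon - \epsilon^2) + \log(1 + \epsilon^2)] < \log(1 + \epsilon)$, which is the claimed improvement; the gain vanishes only at $\epsilon = 0$ and $\epsilon = 1$.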

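Example 3 (BSC): Let $V : \mathcal{X} \to \mathcal{Y}$ be a BSC with
$\mathcal{X} = \mathcal{Y} = \{0, 1\}$ and crossover probability $0 \le \epsilon \le 1/2$.
The cutoff rate of the BSC is given by

$$R_0(V) = 1 - \log(1 + \gamma(\epsilon))$$

where $\gamma(\delta) := \sqrt{4\delta(1 - \delta)}$ for $0 \le \delta \le 1$.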

We combine two copies of the BSC using the label map
$f : (u_1, u_2) \to (x_1, x_2) = (u_1 \oplus u_2, u_2)$, and take input variables
$(U_1, U_2) \sim Q_1(u_1) Q_2(u_2)$ where $Q_1$, $Q_2$ are uniform on
$\{0, 1\}$. The cutoff rates $R_0(U_1; Y_1 Y_2)$ and $R_0(U_2; Y_1 Y_2 | U_1)$
can be obtained by direct calculation; however, it is instructive
to obtain them by the following argument. The input and
output variables of the channel $W$ are related by
$y_1 = u_1 \oplus u_2 \oplus e_1$ and $y_2 = u_2 \oplus e_2$ where $e_1$ and $e_2$
are independent noise terms, each taking the values 0 and 1 with
probabilities $1 - \epsilon$ and $\epsilon$, respectively. Decoder 1 sees effectively
the channel $u_1 \to u_1 \oplus e_1 \oplus e_2$, which is a BSC with crossover
probability $\epsilon_2 = 2\epsilon(1 - \epsilon)$ and has cutoff rate

$$R_0(U_1; Y_1 Y_2) = 1 - \log(1 + \gamma(\epsilon_2))$$

Decoder 2 sees the channel $u_2 \to (y_1, y_2)$ and receives $u_1$
from decoder 1, which is equivalent to the channel
$u_2 \to (y_1 \oplus u_1, y_2) = (u_2 \oplus e_1, u_2 \oplus e_2)$, which in turn
is a BSC with diversity order 2 and has cutoff rate

$$R_0(U_2; Y_1 Y_2 | U_1) = 1 - \log(1 + \gamma(\epsilon)^2)$$

Thus, the normalized sum cutoff rate with this splitting scheme
is given by

$$\bar{R}_{0,S}(U_1 U_2; Y_1 Y_2) = 1 - \tfrac{1}{2} \big[ \log(1 + \gamma(\epsilon_2)) + \log(1 + \gamma(\epsilon)^2) \big]$$

which is larger than $R_0(V)$ for all $0 < \epsilon < 0.5$, as shown in
Fig. 6.

Fig. 6. Cutoff rates for the splitting of BSC.
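The closed-form expressions of Example 3 are easy to tabulate. The sketch below evaluates them directly; the function names are illustrative and the sample crossover probabilities are arbitrary choices.

```python
import math

def gamma(delta):
    """gamma(delta) = sqrt(4 delta (1 - delta))."""
    return math.sqrt(4 * delta * (1 - delta))

def bsc_split_cutoff_rates(eps):
    """Return (R_0(V), R_0 for user 1, R_0 for user 2, normalized sum)."""
    eps2 = 2 * eps * (1 - eps)                     # crossover seen by decoder 1
    r0_single = 1 - math.log2(1 + gamma(eps))      # ordinary cutoff rate of the BSC
    r0_user1 = 1 - math.log2(1 + gamma(eps2))
    r0_user2 = 1 - math.log2(1 + gamma(eps) ** 2)  # diversity-order-2 channel
    return r0_single, r0_user1, r0_user2, 0.5 * (r0_user1 + r0_user2)

for eps in (0.01, 0.1, 0.25, 0.4):
    single, user1, user2, normalized = bsc_split_cutoff_rates(eps)
    print(f"eps={eps:.2f}  R0(V)={single:.4f}  user1={user1:.4f}  "
          f"user2={user2:.4f}  normalized sum={normalized:.4f}")
```

The rate of user 1 drops below $R_0(V)$ while that of user 2 rises above it; the gain comes from the rise outweighing the drop.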

IV. Linear label maps 

This section builds on the method employed in the previous
section by considering general types of linear input maps.
Specifically, we consider combining $n$ independent copies of
a BSC using a linear label map $x = uF$ where $F$ is an
invertible matrix of size $n \times n$. The channel output is given
by $y = x + e$ where $e$ is the noise vector. Throughout, we
use an input ensemble $U = (U_1, \ldots, U_n)$ consisting of i.i.d.
components, each component equally likely to take the values
0 and 1. In the rest of this section, we give two methods that
follow this general idea.

A. Kronecker powers of a given labeling 

We consider here linear maps of the form $F = A^{\otimes k}$ where
$A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$ is the linear map used in Ex. 3.
The normalized sum cutoff rates for such $F$ are listed in the following
table for a BSC with error probability of $\epsilon = 0.1$. The cutoff rate and
capacity of the same BSC are $R_0 = .3219$ and $C = .5310$.

k                 1       2       3       4
$\bar{R}_{0,S}$   .3670   .4016   .4245   .4433

The scheme with $F = A^{\otimes k}$ has $n = 2^k$ subchannels and the size
of the output alphabet of the combined channel equals $2^{2^k}$.
The rapid growth of this number prevented computing $\bar{R}_{0,S}$
for $k \ge 5$.
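A brute-force evaluation for small $k$ can be sketched as follows. This is an independent illustration under stated assumptions, not the program used for the table: inputs are i.i.d. uniform, decoding proceeds in the natural order $u_1, \ldots, u_n$, and the function names are hypothetical. Its output therefore need not agree with the tabulated values digit for digit.

```python
import itertools
import math

def kron_power(A, k):
    """k-fold Kronecker power of a 0/1 matrix given as a list of lists."""
    F = [[1]]
    for _ in range(k):
        F = [[a * b for a in row_f for b in row_a] for row_f in F for row_a in A]
    return F

def normalized_sum_cutoff_rate(F, eps):
    """(1/n) sum_i R_0(U_i; Y | U_1..U_{i-1}) for x = uF (mod 2) over a BSC(eps)."""
    n = len(F)
    words = list(itertools.product((0, 1), repeat=n))

    def encode(u):
        return tuple(sum(u[i] * F[i][j] for i in range(n)) % 2 for j in range(n))

    def likelihood(y, x):
        d = sum(a != b for a, b in zip(y, x))      # number of channel flips
        return eps ** d * (1 - eps) ** (n - d)

    W = {u: [likelihood(y, encode(u)) for y in words] for u in words}
    total_rate = 0.0
    for i in range(n):
        tails = list(itertools.product((0, 1), repeat=n - i - 1))
        acc = 0.0
        for v in itertools.product((0, 1), repeat=i):  # previously decoded bits
            for j in range(len(words)):
                inner = 0.0
                for ui in (0, 1):
                    p = sum(W[v + (ui,) + t][j] for t in tails) / len(tails)
                    inner += 0.5 * math.sqrt(p)
                acc += inner ** 2 / 2 ** i
        total_rate += -math.log2(acc)
    return total_rate / n

A = [[1, 0], [1, 1]]
for k in (1, 2):
    print(k, normalized_sum_cutoff_rate(kron_power(A, k), 0.1))
```

The enumeration touches all $2^n$ inputs and $2^n$ outputs, which mirrors the growth of the output alphabet noted above and limits this sketch to small $k$.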



B. Label maps from block codes 

Let $G = [P \; I_k]$ be the generator matrix in systematic form
of a $(n, k)$ linear binary block code $\mathcal{C}$. Here, $P$ is a $k \times (n - k)$
matrix and $I_k$ is the $k$-dimensional identity matrix. A linear
label map is obtained by setting

$$F = \begin{bmatrix} I_{n-k} & 0 \\ P & I_k \end{bmatrix}. \qquad (3)$$

Note that $F^{-1} = F$ and that the first $(n - k)$ columns of $F$
equal $H^T$, the transpose of a parity-check matrix for $\mathcal{C}$. Thus,
when the receiver computes the vector $v = yF^{-1} = yF$, the
first $(n - k)$ coordinates of $v$ have the form $v_i = u_i \oplus s_i$,
$1 \le i \le n - k$, where $s_i$ is the $i$th element of the syndrome
vector $s = yH^T = eH^T$. This $i$th "syndrome subchannel"
is effectively the cascade of $k_i$ BSCs (each with crossover
probability $\epsilon$) where $k_i$ is the number of 1's in the $i$th row
of $H$. The remaining subchannels, which we call "information
subchannels," have the form $v_i = u_i \oplus e_i$, $(n - k + 1) \le i \le n$.
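The structure of the label map in (3) can be checked on a small code. The sketch below builds $F$ from a matrix $P$ and verifies over GF(2) that $F$ is its own inverse and that $yF = u + eF$ splits into syndrome and information coordinates. The particular $P$ (a systematic (7,4) code) and the vectors `u` and `e` are hypothetical examples chosen for illustration, not taken from the source.

```python
def label_map_matrix(P):
    """Build F = [[I_{n-k}, 0], [P, I_k]] from a k x (n-k) binary matrix P."""
    k, r = len(P), len(P[0])          # r = n - k
    n = k + r
    F = [[0] * n for _ in range(n)]
    for i in range(r):
        F[i][i] = 1                   # upper-left identity block
    for i in range(k):
        for j in range(r):
            F[r + i][j] = P[i][j]     # lower-left block P
        F[r + i][r + i] = 1           # lower-right identity block
    return F

def vec_mat_gf2(v, M):
    """Row vector times matrix over GF(2)."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) % 2 for j in range(len(M[0]))]

def mat_mat_gf2(A, B):
    """Matrix product over GF(2)."""
    return [vec_mat_gf2(row, B) for row in A]

# Hypothetical P for a (7,4) systematic code; any binary P would do here.
P = [[1, 1, 0], [0, 1, 1], [1, 1, 1], [1, 0, 1]]
F = label_map_matrix(P)
n, r = len(F), len(P[0])

u = [1, 0, 1, 1, 0, 0, 1]             # illustrative input vector
e = [0, 0, 1, 0, 0, 1, 0]             # illustrative noise vector
y = [(a + b) % 2 for a, b in zip(vec_mat_gf2(u, F), e)]
v = vec_mat_gf2(y, F)                 # receiver computes v = yF

H_T = [row[:r] for row in F]          # first n-k columns of F
s = vec_mat_gf2(e, H_T)               # syndrome of the noise vector
print("v =", v)
print("u + (s, e_info) =", [(u[i] + (s[i] if i < r else e[i])) % 2 for i in range(n)])
```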

Example 4 (Dual of Golay code): Let $F$ be as in (3) with
$n = 23$, $k = 11$, and $P$ chosen so that the code with the generator
matrix $G = [P \; I_{11}]$ is the dual
of the Golay code [10, p. 119]. We computed the normalized
sum cutoff rate $\bar{R}_{0,S} = .4503$ at $\epsilon = 0.1$ for this scheme. The
rate allocation vector $(R_0(U_i; Y | U_1, \ldots, U_{i-1}) : 1 \le i \le 23)$
is shown in Fig. 7. There is a jump in the rate allocation
vector in going from the syndrome subchannels to information
subchannels, as expected.

V. Concluding remarks 

We have presented a method for improving the sum cutoff
rate of a given DMC based on channel combining and splitting.
Although the method has been presented for some binary-input
channels, it is readily applicable to a wider class of channels.
Our starting point for studying this problem is rooted in the
literature on methods to improve the cutoff rate in sequential
decoding, most notably, Pinsker's [9] and Massey's [2] works;
however, the method we proposed has many common elements
with well-known coded-modulation techniques, namely, Imai
and Hirakawa's [8] multi-level coding scheme and Ungerboeck's
[11] set-partitioning idea, which corresponds to the
relabeling of inputs in our approach. In this connection, we
should cite the paper by Wachsmann et al. [12] which develops
design methods for coded modulation using the sum cutoff rate
and random-coding exponent as figures of merit.

Fig. 7. Rate allocation for Ex. 4.

Our main aim has been to explore the existence of practical 
schemes that boost the sum cutoff rate to near channel capac- 
ity. This goal remains only partially achieved. Further work is 
needed to understand if this is a realistic goal. 